AMENDMENT TO THE CLAIMS 
1. (Currently Amended) A method of compressing a log of linguistic data, the log having a 
plurality of linguistic help query strings, each string including at least two tokens, the 
method comprising: 

applying a compression operation to each string; 

determining if any two strings match each other after the compression operation; 
and 

removing one of the two matching strings from the log. 
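
For illustration only, the steps of claim 1 can be sketched as follows. The claim does not define the compression operation, so the normalization used here (lowercasing, dropping punctuation, collapsing whitespace) and the function names compress and compress_log are assumptions, as is the choice to keep the compressed form of each surviving string.

```python
import re


def compress(query: str) -> str:
    """Hypothetical compression operation (not specified by the claim):
    lowercase, drop punctuation, collapse runs of whitespace."""
    lowered = query.lower()
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    return " ".join(no_punct.split())


def compress_log(log: list[str]) -> list[str]:
    """Apply the compression operation to each string, then remove any
    string that matches an earlier one after compression."""
    seen = set()
    kept = []
    for query in log:
        key = compress(query)
        if key in seen:
            continue  # a matching string is already in the log; drop this one
        seen.add(key)
        kept.append(key)  # assumption: the compressed form is what is kept
    return kept


raw_log = ["How do I print?", "how do i print", "Change font size"]
compressed = compress_log(raw_log)
```

In this sketch the first two entries of raw_log become identical after compression, so only one of them survives in compressed.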



2. (Currently Amended) The method of claim 1, wherein the log is a log of user-initiated 
inputs to a help interface. 

3. (Currently Amended) The method of claim 2, wherein each string is a query relative to a 
help function. 

4. (Currently Amended) The method of claim 3, wherein each help-related query is relative 
to a computer system. 

5. (Original) The method of claim 1, wherein the compression operation is character-based. 

6. (Original) The method of claim 1, wherein the compression operation is token-based. 

7. (Original) The method of claim 1, wherein the compression operation is subsumption. 

8. (Original) The method of claim 7, wherein subsumption includes applying an impossibility 
condition to selectively compute edit distance. 
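
As a sketch of what an impossibility condition could look like, the example below skips the edit-distance computation whenever the two strings differ in length by more than the allowed distance. This relies on the fact that Levenshtein distance is never smaller than the difference in string lengths. The claim does not state which condition, distance measure, or threshold is used, so the length test, the Levenshtein measure, and the names edit_distance, within_distance, and max_distance are all assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def within_distance(a: str, b: str, max_distance: int) -> bool:
    """Return True if a and b are within max_distance edits of each other.

    Assumed impossibility condition: the distance is at least the length
    difference, so if that difference already exceeds the threshold the
    full computation is skipped.
    """
    if abs(len(a) - len(b)) > max_distance:
        return False  # impossible to be within the threshold
    return edit_distance(a, b) <= max_distance
```

The length test costs constant time per pair, while the full distance computation is proportional to the product of the two string lengths, which is the motivation for computing it only selectively.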



9. (Original) The method of claim 1, and further comprising: 
applying a second compression operation to each string; 

determining if any two strings match each other after the second compression 
operation; and 

removing one of the two matching strings from the log. 

10. (Original) The method of claim 9, wherein the first compression operation is character-based 
and the second compression operation is token based. 
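
The two-stage arrangement of claims 9 and 10 can be illustrated with a short sketch. The claims do not say what the character-based and token-based operations do, so both passes here are assumptions: the character pass normalizes case, punctuation, and whitespace, and the token pass drops words from a hypothetical stop-word list. All names are illustrative.

```python
import re

# Hypothetical stop-word list; not taken from the claims.
STOP_WORDS = {"a", "an", "the", "do", "i", "how", "to"}


def character_pass(s: str) -> str:
    """Assumed character-based operation: lowercase, strip punctuation,
    collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", s.lower()).split())


def token_pass(s: str) -> str:
    """Assumed token-based operation: drop stop words, keeping the string
    unchanged if every token would be dropped."""
    tokens = [t for t in s.split() if t not in STOP_WORDS]
    return " ".join(tokens) if tokens else s


def dedupe(strings) -> list[str]:
    """Remove later matching strings, preserving first-seen order."""
    return list(dict.fromkeys(strings))


def two_stage_compress(log: list[str]) -> list[str]:
    stage_one = dedupe(character_pass(s) for s in log)   # first operation
    return dedupe(token_pass(s) for s in stage_one)      # second operation


raw_log = ["How do I print a page?", "print the page", "Print page!"]
after_first = dedupe(character_pass(s) for s in raw_log)
after_second = two_stage_compress(raw_log)
```

With this input the character pass alone leaves three distinct strings, and the matches only appear after the token pass, which shows why a second operation can remove entries the first one cannot.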

11. (Original) The method of claim 10, and further comprising applying subsumption after the 
second compression operation is complete. 

12. (Original) The method of claim 11, wherein the subsumption operation is repeated for the 
log. 

13. (Original) The method of claim 1, and further comprising training a statistical process with 
the compressed log. 

14. (Currently Amended) A system for compressing a query log having a plurality of 
linguistic help query strings, each string having a plurality of tokens, the system comprising: 

an input for receiving a raw query log; 
memory for storing the raw query log; 

a processor for applying at least one compression operation to each string, and for 
scanning the modified strings to determine if any match each other so that 
one of the matching strings can be removed; and 

an output for providing a compressed query log once the removal is complete. 

15. (Currently Amended) The system of claim 14, wherein each string is a query relative to 
a help function. 

16. (Currently Amended) The system of claim 15, wherein each help-related query is relative 
to a computer system. 

17. (Original) The system of claim 14, wherein the at least one compression operation is 
character-based. 

18. (Original) The system of claim 14, wherein the at least one compression operation is token- 
based. 

19. (Original) The system of claim 14, wherein the at least one compression operation includes 
subsumption. 

20. (Original) The system of claim 19, wherein subsumption includes applying an impossibility 
condition to selectively compute edit distance. 

21. (Original) The system of claim 14, and further comprising: 

applying at least a second compression operation to each string; 

determining if any two strings match each other after the second compression 
operation; and 

removing one of the two matching strings from the log. 

22. (Original) The system of claim 21, wherein the first compression operation is character- 
based and the second compression operation is token based. 

23. (Original) The system of claim 22, and further comprising applying subsumption after the 
second compression operation is complete. 



24. (Original) The system of claim 23, wherein the subsumption operation is repeated for the 
log. 

25. (Original) The system of claim 14, and further comprising training a statistical process 
with the compressed log. 



